PREDICTING  DATABASE  REQUIREMENTS  FOR  GEOGRAPHIC
INFORMATION  SYSTEMS  IN  THE  YEAR  2000:  LONG-TERM 
DESIGN  ISSUES  FOR  GRASS 


1  INTRODUCTION 


Background 

In  1983  the  U.S.  Army  Construction  Engineering  Research  Laboratory  (USACERL)  Environmental 
Division  (EN)  created  the  Geographic  Resources  Analysis  Support  System  (GRASS),  a  computer-based 
geographic  information  system  (GIS).  Since  that  time  GRASS  has  been  under  continuous  development 
to  address  the  Army’s  increasingly  complex  land  analysis  needs. 

Over  the  years,  the  use  of  GRASS  has  increased  in  the  public  and  private  sectors  to  assess 
environmental  impacts,  evaluate  site  suitability,  detect  change  over  time,  manage  cultural,  historical,  and 
natural  resources,  and  model  the  effects  of  environmental  phenomena.  These  increased  demands  on 
GRASS  have  resulted  in  the  recognition  of  several  limitations  in  the  system  as  currently  implemented. 
Limitations  in  the  software  include  the  lack  of  support  for  aspatial  data  associated  with  spatial  entities, 
the  lack  of  a  user-friendly  graphical  interface,  the  lack  of  vector  analysis  capability,  and  the  inability  to 
store  anything  other  than  integer  data.  Some  of  these  limitations  could  be  addressed  by  upgrading  the 
current  GRASS  system  or  by  integrating  other  software  in  a  loose  manner.  A  better  approach  might  be 
to  redesign  and  develop  a  new  generation  of  GRASS  that  addresses  these  limitations  while  preserving  the 
successes  of  the  current  systems. 

The initial design of GRASS in the early 1980s targeted hardware that would be available in the
near-term future (5 to 10 years), and such hardware did become popular in the workplace. Machines now
on the drawing board will bring more power and new capabilities at a lower cost to the workstation
community to which GRASS will remain targeted. These hardware environments will open wide the
doors to greater software potentials. Just as software has long been limited by hardware capability, these
potentials are also defined and limited by underlying data structures and data storage techniques. It is
important, therefore, to define such structures and techniques in a way that allows for the most effective
exploitation of the hardware environment.


Objective 

The objective of this study was to develop recommendations for addressing some of the weaknesses
in current GISs (specifically in GRASS), including discussion of how the recommendations might affect
software capabilities and possible implementation strategies for achieving those capabilities.


Approach 

The  authors  reviewed  the  recent  literature  in  the  area  of  next-generation  database  management 
systems  (including  object-oriented  database  management  systems  [OODBMS]  and  third-generation  or 
extended relational database management systems [RDBMS]). In addition, literature in four areas of
software capability was examined, especially in reference to GISs:

•  Image  processing  systems 

•  Spatial  modeling  and  analysis 

•  Map  data  input,  output,  and  editing 

•  User  interfaces. 

Literature  detailing  the  application  of  advanced  hardware  to  these  areas  of  capability  was  examined. 
Limitations  and  problems  that  would  most  likely  present  themselves  when  implementing  these  new 
capabilities  were  identified. 

The  research  is  presented  in  two  parts:  (1)  software  capabilities  and  requirements,  including  a 
description  of  one  possible  future  GIS  environment  for  reference,  and  (2)  implementation  strategies. 
Several  areas  of  necessary  research  were  identified. 


Scope 

This  report  presents  results  of  a  preliminary  investigation  into  the  problems  discussed  in  the 
Background  section.  It  is  not  intended  to  answer  every  question  related  to  correcting  the  limitations  of 
GRASS,  but  instead  presents  one  possible  solution,  implementation  strategies,  and  recommended  directions 
for  further  research. 


Mode  of  Technology  Transfer 

The  information  presented  in  this  report  is  being  evaluated  for  incorporation  into  future  versions  of 
GRASS.  Resulting  technologies  will  be  implemented  in  the  software  targeted  at  the  general  GRASS  user 
community.  Technology  transfer  will  also  be  supported  through  the  updating  of  user  documentation  and 
services  offered  through  the  GRASS  Information  Center  at  USACERL. 

2  GIS  SOFTWARE  CAPABILITIES  AND  REQUIREMENTS


The  Need  for  Innovation 

GISs  require  the  storage  of  large  amounts  of  data  which  must  be  provided  to  multiple  users  on 
demand.  GIS  users  come  from  varying  backgrounds  and,  thus,  have  differing  data  manipulation  needs. 
Cartographers,  for  example,  might  want  to  combine  information  from  several  sources  to  construct  maps 
that convey a desired message. Land managers may use the data to develop management plans for
large tracts of land based on the characteristics of the land and the activities being planned. Environmental
scientists may model physical phenomena using GIS data as inputs, and use display functionality to visualize
results. Engineers may want to integrate the models developed by scientists into simulations both of
interactions  between  physical  processes  and  effects  of  these  interactions  on  their  designs.  Individuals  in 
many  fields  may  wish  to  investigate  previously  undiscovered  relationships  between  different  spatially 
referenced  data.  From  these  few  examples  it  is  obvious  that  a  GIS  must  provide  a  wide  range  of 
functionality.  To  effectively  apply  GISs  to  their  problems,  users  need  new,  innovative  systems  that  will 
provide  them  with  the  tools  for  constructing,  navigating,  and  processing  geographic  databases.1 

Additionally,  aspatial  data  is  commonly  used  in  conjunction  with  spatial  data.  An  environment  that 
supports  all  of  the  activities  cited  in  the  previous  paragraph  must  also  handle  all  types  of  data  and  the 
relationships  among  those  data.  Next-generation  applications  (e.g.,  computer-aided  design  and 
manufacturing  systems,  multimedia  and  hypermedia  information  systems,  expert  systems)  require  databases 
that  can  support  the  requirements  of  these  applications.2  GISs  fall  into  this  category,  and  it  is  obvious 
that  future  GIS  environments  must  be  based  on  object-oriented  database  management  systems. 

Perhaps  the  most  interesting  plight  is  that  of  the  engineer  trying  to  model  interaction  between 
physical systems using a GIS. Since this is among the most demanding kinds of work engineers currently
do  on  GISs,  the  current  research  has  focused  on  ways  to  improve  GISs  to  facilitate  such  complex 
modeling  tasks.  In  the  following  pages,  the  authors  outline  an  idealized  system  that  might  address  the 
specific  weaknesses  of  current  GISs.  This  idealized  system  is  offered  as  a  frame  of  reference  for 
subsequent  discussions  throughout  this  report. 

One  Possible  Future  GIS  Organization 

A  GIS  must  consist  of  a  toolbox  that  includes  functional  modules3  that  operate  on  specific  data 
types  (or  “objects”).  This  requirement  stems  from  the  fact  that  engineering  tasks  frequently  can  be  boiled 
down  to  operations  on  complex  physical  objects  that  can  be  expressed  in  terms  of  their  state  variables, 
parameters,  and  operations  on  these  variables  and  parameters.  Examples  of  such  a  modular  concept  might 
include: 

•  A  module  for  performing  algebra  on  a  set  of  raster  images 

•  A  module  for  performing  a  shortest  path  analysis  on  a  network  of  arcs  and  nodes 

•  A  module  for  combining  raster  data,  with  vector  linework  and  textual  annotations  for  the 
construction  of  cartographically  correct  maps 


1 Dutton, G., “Improving Spatial Analysis in GIS Environments,” Auto-Carto 10 (March 1991), pp 168-185.

2 Joseph, J.V., et al., “Object Oriented Databases: Design and Implementation,” Proceedings of the IEEE, 79(1) (Institute of
Electrical and Electronics Engineers [IEEE], January 1991), pp 42-64.

3 Westervelt, J., “The Two Classes of GIS Users (Or How to Make Software Salad),” GIS World, 3(5) (October 1990), pp 111-112.

•  Modules  for  extending  the  set  of  data  types  to  include  problem-specific  and  complex  data
objects.  (An  example  here  would  be  a  watershed  object  made  of  several  component  objects 
such  as  reservoir  objects,  stream-reach  objects,  diversion  objects,  waste  treatment  plant  objects, 
and  so  on.  Multiple  instances  of  these  objects  would  commonly  be  connected  to  form  a  specific 
watershed  being  modeled.) 
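
The watershed example above can be illustrated with a minimal sketch. It assumes Python dataclasses; the class names (Reservoir, StreamReach, Watershed), their fields, and the simple routing rule are hypothetical and chosen only to show how component objects with state variables and parameters might be connected into one complex object. They are not part of GRASS.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Reservoir:
    """Component object: storage is a state variable, capacity a parameter."""
    name: str
    capacity: float          # parameter (volume units)
    storage: float = 0.0     # state variable (volume units)

    def receive(self, inflow: float) -> float:
        """Store inflow up to capacity and return the spilled volume."""
        space = self.capacity - self.storage
        stored = min(inflow, space)
        self.storage += stored
        return inflow - stored


@dataclass
class StreamReach:
    """Component object: passes flow downstream minus a fixed loss fraction."""
    name: str
    loss_fraction: float = 0.0   # parameter in [0, 1]

    def receive(self, inflow: float) -> float:
        return inflow * (1.0 - self.loss_fraction)


@dataclass
class Watershed:
    """Complex object formed by connecting component instances in order."""
    name: str
    components: List[Union[Reservoir, StreamReach]] = field(default_factory=list)

    def route(self, inflow: float) -> float:
        """Pass a volume of water through each component in sequence."""
        flow = inflow
        for component in self.components:
            flow = component.receive(flow)
        return flow


# One specific watershed assembled from multiple component instances.
basin = Watershed("example_basin", [
    StreamReach("upper_reach", loss_fraction=0.25),
    Reservoir("main_reservoir", capacity=50.0),
    StreamReach("lower_reach"),
])
outflow = basin.route(100.0)
```

In this sketch the upper reach loses a quarter of the inflow, the reservoir fills to capacity and spills the remainder, and the lower reach passes the spill unchanged. Additional component types (diversions, treatment plants) would only need to supply the same receive operation.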

To  implement  this  concept  the  GIS  of  the  future  should  be  constructed  on  top  of  an  extended 
RDBMS or an OODBMS that can handle many types of data, both spatial and aspatial.4 Figure 1 depicts
one  possible  organization  of  such  a  GIS.  This  organization  is  layered,  with  each  layer  building  on  the  one 
below  and  providing  the  functionality  needed  to  build  the  next  layer.  The  underlying  layer  of  DBMS 
software  is  accessible  via  an  application  programming  interface  (API)  and/or  a  query  interface  (QI).  If 
users  want  to  browse  the  database  before  or  during  application  execution,  the  query  interface  provides  this 
function. Functional modules are then constructed using the API to store and retrieve data. These
functional  modules  are  used  in  applications  (APP)  along  with  the  API  to  form  the  next  level  of  access. 
Finally,  the  graphical  user  interface  (GUI)  level  is  on  top.  Note  that  the  user  interface  level  has  access 
to  applications,  functional  modules,  and  the  QI.  The  following  paragraphs  describe  the  four  levels  of  use 
according  to  the  type  of  person  accessing  that  level. 

The  End  User  Level.  From  the  standpoint  of  the  end  user,  this  GIS  would  allow  both  direct  queries 
of  the  geographic  database  and  the  ability  to  graphically  invoke  application  programs.  This  level  of  user 
might see a palette of icons (an “icon manager”) representing applications and functional program modules
used  to  operate  on  data.  Each  application  or  functional  module  would  have  its  own  interface  for  gathering 
information  about  input  “sources,”  output  “sinks,”  and  application  parameters. 

The Application Programmer Level. The application programmer would use the GIS at the end user
level to create and register data flow applications (macros) for the end users. The macros would
be  created  using  a  regular  text-based  editor  or  a  visual  programming  environment  (VPE).  The  VPE  would 
allow  application  programmers  to  assemble  functional  modules  in  a  data  flow  arrangement,  and  register 
the  completed  “pipeline”  (application)  with  the  icon  manager.  An  example  of  a  VPE  is  depicted  in  Figure 
2.  Assume  an  application  is  desired  that  consists  of  applying  functions  X,  Y,  and  Z  consecutively  to  an 
input  map  and  then  displaying  the  output  map.  The  application  programmer  need  only  drag  icons 
representing each of the five operations (input raster map, function X, function Y, function Z, and display
raster  map)  from  the  icon  palette  into  the  visual  programming  area,  connect  them  with  rubber-banding 
lines  that  represent  the  flow  of  data  between  the  modules,  edit  parameters  associated  with  each  module 
so  the  modules  perform  as  expected  (in  an  editor  not  shown),  and  press  a  start  button  in  the  control  panel 
(not  shown)  to  initiate  the  application  for  testing. 
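
The data flow arrangement described above can be sketched as ordinary function composition. The sketch below assumes rasters are small lists of integer rows; make_pipeline, function_x, function_y, function_z, and display_raster are hypothetical stand-ins for registered modules, and the arithmetic inside each one is arbitrary.

```python
from typing import Callable, List

Raster = List[List[int]]
Module = Callable[[Raster], Raster]


def make_pipeline(*modules: Module) -> Module:
    """Connect modules in order; the result applies them consecutively."""
    def run(raster: Raster) -> Raster:
        for module in modules:
            raster = module(raster)
        return raster
    return run


# Hypothetical functional modules standing in for X, Y, and Z.
def function_x(raster: Raster) -> Raster:
    """Add 1 to every cell."""
    return [[cell + 1 for cell in row] for row in raster]


def function_y(raster: Raster) -> Raster:
    """Double every cell."""
    return [[cell * 2 for cell in row] for row in raster]


def function_z(raster: Raster) -> Raster:
    """Clamp every cell to a maximum of 9."""
    return [[min(cell, 9) for cell in row] for row in raster]


def display_raster(raster: Raster) -> Raster:
    """Sink module: print the raster and pass it through unchanged."""
    for row in raster:
        print(" ".join(str(cell) for cell in row))
    return raster


# The "registered" application: X, then Y, then Z, then display.
application = make_pipeline(function_x, function_y, function_z, display_raster)
result = application([[0, 1], [2, 5]])
```

Here the input raster map is simply the argument passed to the assembled application. A visual environment would build the same chain from icons and connecting lines rather than from a function call.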

The Module Programmer Level. The module programmer would use an application programming
interface  to  the  database  to  create  new  functional  modules  using  a  high-level  programming  language,  and 
may  incorporate  embedded  query  language  code  directly  into  application  code  as  a  cleaner  interface  to 
database  functionality.  This  violates  the  level  modularity  presented  here  somewhat,  but  as  long  as  new 
functionality  added  at  level  4  is  supported,  then  the  movement  of  this  functionality  upward  through  the 
levels  can  be  supported.  Modules  created  at  this  level  can  then  be  integrated  into  the  VPE  for  use  by  level 
2  users. 

The  System  Programmer  Level.  The  system  programmer  would  design  new  database  types  and 
create  methods  to  operate  on  these  types,  thus  extending  the  functionality  of  the  database.  This  person 
would  also  be  responsible  for  creating  access  to  new  functionality  (methods)  via  the  API  and  the 
embedded spatial query language interfaces for use by level 3 users.
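
As a rough illustration of what adding a type and its methods might involve, the sketch below keeps a small catalog that maps a type name and a method name to a function. The TypeRegistry class, the "box" type, and its two methods are all hypothetical; this is not the interface of any particular DBMS.

```python
from typing import Any, Callable, Dict, Tuple


class TypeRegistry:
    """Hypothetical catalog of user-defined types and their methods."""

    def __init__(self) -> None:
        self._methods: Dict[Tuple[str, str], Callable[..., Any]] = {}

    def register_method(self, type_name: str, method_name: str,
                        func: Callable[..., Any]) -> None:
        """Attach a method to a type so higher levels can call it by name."""
        self._methods[(type_name, method_name)] = func

    def call(self, type_name: str, method_name: str, *args: Any) -> Any:
        """Dispatch a call by name, failing clearly on unknown methods."""
        try:
            func = self._methods[(type_name, method_name)]
        except KeyError:
            raise LookupError(
                f"{type_name}.{method_name} is not registered") from None
        return func(*args)


# A new "box" type, stored as (xmin, ymin, xmax, ymax).
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    xmin, ymin, xmax, ymax = box
    return (xmax - xmin) * (ymax - ymin)


def box_overlaps(a: Box, b: Box) -> bool:
    """True when the two boxes share any interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


registry = TypeRegistry()
registry.register_method("box", "area", box_area)
registry.register_method("box", "overlaps", box_overlaps)

area = registry.call("box", "area", (0.0, 0.0, 4.0, 2.0))
hit = registry.call("box", "overlaps",
                    (0.0, 0.0, 4.0, 2.0), (3.0, 1.0, 5.0, 5.0))
```

The point of the sketch is the division of labor: the system programmer supplies the type and its methods once, and callers at higher levels reach them only by name through a single dispatch point.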

The  modules  expected  in  such  an  environment  should  include  the  following  categories  of  capability: 
(1)  image  processing,  (2)  modeling  and  analysis,  and  (3)  data  input,  output,  and  editing.  Each  of  these 


4 Frank, A.U., “Requirements for a Database Management System for a GIS,” Photogrammetric Engineering and Remote Sensing,
54(11) (November 1988), pp 1557-1564.

Figure  1.  One  Possible  Organization  of  a  Future  GIS.


Figure  2.  A  Visual  Programming  Environment. 


categories  of  functionality  is  discussed  in  the  following  sections.  The  next  section  details  the  software 
components  described  above. 

Database  Software 

Clearly,  one  very  important  component  of  GIS  software  is  that  dealing  with  storage  and  retrieval 
of  the  data.  Equally  important  are  the  data  structures  used  to  implement  this  software.5  Many 
approaches  could  be  taken  in  applying  database  software  to  a  GIS.  GIS  functionality  could  be  built 
around  a  custom  DBMS,  or  the  GIS  database  could  be  designed  and  written  from  scratch  using  specialized 
storage  and  retrieval  techniques  (GRASS  is  currently  implemented  the  latter  way).  Both  of  these 
implementation  strategies  and  their  implications  are  discussed  in  Chapter  3.  Several  key  areas  identified 
merit  further  study  and  could  help  guide  the  development  of  future  GIS  database  software.  They  are 
discussed  below. 

Object-Oriented  and  Extended  Relational  Database  Systems 

Currently,  RDBMSs  provide  a  wide  variety  of  services,  but  in  general  they  fail  to  fully  support 
applications where the data are more complex, such as a GIS. DBMSs currently provide a fixed set of
basic data types (character, integer, floating point, dollar amount, date, etc.) and a few functions to operate
on them. Much research has examined (and continues to examine) this difficulty in light of the increasing number of
applications requiring a more capable DBMS.6 Much of this research has concerned complex object (often
referred  to  as  non-first  normal  form,  or  NF2)  databases.7  Complex  object  databases  are  either  an  extension 
of  the  relational  database  paradigm8  or  are  created  by  building  objects  out  of  subobjects  in  an  object- 
oriented  fashion.9  Using  these  models  it  is  possible  to  model  complex  structured  objects.  These 
databases  have  mainly  been  targeted  for  application  to  computer-aided  design  (CAD)  and  computer-aided 
manufacturing  (CAM)  databases.10 

Much research has also focused on the confluence of the object-oriented programming paradigm and
databases, which has resulted in OODBMSs.11,12,13,14,15,16,17 Essentially, these systems merge
the object-oriented language paradigm with that of database systems. OODBMSs do not follow the


5 Cowen, D.J., “GIS versus CAD versus DBMS: What are the Differences?,” Photogrammetric Engineering and Remote Sensing,
54(11) (November 1988), pp 1551-1555.

6 Wilms, P.F., P.M. Schwarz, H.J. Schek, and L.M. Haas, “Incorporating Data Types in an Extensible Database Architecture,”
Proc. 3rd Int. Conf. Data and Knowledge Bases: Improving Usability and Responsiveness (June 1988), pp 180-192.

7 Schek, H.J., et al., “The DASDBS Project: Objectives, Experiences, and Future Prospects,” IEEE Trans. on Knowledge and Data
Engineering (March 1990), pp 25-43.

8 Dadam, P., et al., “A DBMS Prototype to Support Extended NF2 Relations: An Integrated View on Flat Tables and Hierarchies,”
Proc. 1986 ACM SIGMOD (May 1986), pp 356-367.

9 Harder, T., H. Schoning, and A. Sikeler, “Parallel Query Evaluation: A New Approach to Complex Object Processing,” IEEE
Database Engineering, Vol 8 (1989), pp 21-29.

10 Wilkes, W., P. Klahold, and G. Schlageter, “Complex and Composite Objects in CAD/CAM Databases,” Proc. 5th Int. Conf.
Data Engineering (February 1989), pp 443-450.

11 Fishman, D.H., et al., “Overview of the IRIS DBMS” in W. Kim and F.H. Lochovsky, eds., Object-Oriented Concepts,
Databases, and Applications (Addison Wesley, Reading, MA, 1987), pp 219-249.

12 Wilkinson, K., P. Lyngbaek, and W. Hasan, “The IRIS Architecture and Implementation,” IEEE Trans. on Knowledge and
Data Engineering, 2(1) (March 1990), pp 63-75.

13 Deux, O., et al., “The Story of O2,” IEEE Trans. on Knowledge and Data Engineering, 2(1) (March 1990), pp 91-108.

14 Kim, W., et al., “Architecture of the ORION Next-Generation Database System,” IEEE Trans. on Knowledge and Data
Engineering, 2(1) (March 1990), pp 109-124.

15 Agrawal, R., and N.H. Gehani, “ODE (Object Database and Environment): The Language and the Data Model,” ACM SIGMOD
Record (June 1989), pp 34-43.

16 Boral, H., et al., “Prototyping Bubba, A Highly Parallel Database System,” IEEE Trans. on Knowledge and Data Engineering,
2(1) (March 1990), pp 4-23.

17 Haas, L.M., et al., “Starburst Mid-Flight: As the Dust Clears,” IEEE Trans. on Knowledge and Data Engineering, 2(1) (March
1990), pp 143-159.

relational model and, therefore, do not typically include functionality commonly available in RDBMSs
(such as structured query language [SQL], although new query languages are being explored for these
systems18). OODBMSs are cited as having advantages for CAD/CAM databases, artificial intelligence
applications, and office information systems.

Capabilities  that  have  been  explored  include:  rich  data  modeling  constructs,  direct  database  support 
for  inference,  novel  data  types  (especially  graphic  images,  voice,  text,  vectors,  matrices),  persistent  data 
types  (i.e.,  data  that  keeps  its  value  across  database  sessions),  long  transactions  (those  lasting  minutes  to 
days), and multiple versions of data. In one comparative case study, the OODBMS was found to be superior
to the RDBMS in that its schema provides more convenient access to information and is easier to
extend and maintain.19

Some  very  interesting  research  has  been  done  by  Stonebraker  and  Rowe20  with  the  design  of 
POSTGRES,  the  successor  to  the  INGRES  relational  database  system.21  According  to  Stonebraker  and 
Rowe,  the  main  design  goals  of  POSTGRES  are  to: 

1.  Provide  better  support  for  complex  objects

2.  Provide  user  extensibility  for  data  types,  operators,  and  access  methods 

3.  Provide  facilities  for  active  databases  and  inferencing,  including  forward-  and  backward- 
chaining 

4.  Simplify  the  DBMS  code  for  crash  recovery 

5.  Produce  a  design  that  can  take  advantage  of  optical  disks,  workstations  composed 
of  multiple  tightly  coupled  processors,  and  custom  designed  VLSI*  chips 

6.  Make as few changes as possible (preferably none) to the relational model.

The benefit of the POSTGRES model over the others is that all of the useful characteristics of the
relational model are retained while exploiting the benefits of the object-oriented model. Clearly, the first
two goals listed above satisfy most of the modeling needs of engineers, while the remaining goals satisfy
the need to take into account future hardware technologies. Furthermore, because POSTGRES source
code is in the public domain, its usefulness for specific research into geographic applications is enhanced.
Recently, a large project called Sequoia 2000 was initiated at several campuses of the University of
California; one of this project’s goals is to integrate GRASS with POSTGRES.22 Although it is not the
ultimate goal of Sequoia 2000, this integration should provide important insights into the problems with
building a GIS with an RDBMS core.


18 Blakeley, J.A., C.W. Thompson, and A.M. Alashqur, “Strawman Reference Model for Object Query Languages,”
X3/SPARC/DBSSG/OODBTG Task Group Workshop (October 1990).

19 Ketabchi, M.A., S. Mathur, T. Risch, and J. Chen, “Comparative Analysis of RDBMS and OODBMS: A Case Study,” Proc.
IEEE COMPCON 90 (February 1990), pp 528-535.

20 Stonebraker, M., and L. Rowe, “The Design of POSTGRES,” IEEE Database Engineering, Vol 6 (1987).

21 Stonebraker, M., et al., “The Design and Implementation of INGRES,” ACM Trans. on Database Systems, 1(3) (September
1976), pp 189-222.

* VLSI: very-large-scale integration.

22 Stonebraker, M., and J. Dozier, Sequoia 2000: Large Capacity Object Servers to Support Global Change Research, Technical
Proposal (University of California, May 1991).

While  it  appears  that  the  work  of  the  POSTGRES  team  may  have  its  advantages,  it  is  not  clear  that
POSTGRES  will  be  able  to  process  the  large  amount  of  data  required  in  a  raster  GIS  with  the  level  of 
performance  desired  by  users.  Also,  the  advantages  offered  by  the  object-oriented  paradigm  are  not 
explicitly  available  in  the  POSTGRES  system  as  of  version  3.0.  The  reader  is  referred  to  several  papers 
of interest and insight by the POSTGRES development team.23,24,25,26,27,28,29

Distributed  Database  Technology 

Distributed  database  technology  is  important  because  users  of  computer  systems  in  general  are 
becoming  decentralized.  In  other  words,  they  are  relying  more  than  ever  on  sharing  data  over  networks. 
As  long  as  there  exists  a  network  of  homogeneous  databases  that  somehow  can  find  each  other,  users  can 
have  truly  distributed  databases.  This  is  a  very  powerful  concept  because  it  allows  local  sites  to  have  data 
that  are  used  routinely  in  a  local  database  while  allowing  less-used  databases  to  remain  at  the  sites  where 
they  are  most  needed.30  Data  distribution  will  prevent  the  need  for  costly  replication  of  geographic 
databases. 

As an aside, it should be mentioned that the “open systems” concept (very loosely defined as the
provision of system architecture specifics needed to integrate or move software across platforms) is going
forward and the possibility of interoperability among heterogeneous systems is imminent. The
International  Standards  Organization’s  Open  Systems  Interconnection  (ISO-OSI)31  and  Open  Software 
Foundation’s  (OSF)  Application  Environment  Specification  (AES),32  to  name  only  two,  are  indications 
that  there  is  significant  interest  in  standardizing  interconnection  and  user  environments.  This  interest 
indicates  that  open  systems  will  become  a  reality  in  the  future.  What  this  may  mean  from  a  database 
standpoint  is  that  database  software  will  be  able  to  provide  direct,  read-only  access  to  Oracle®,  for 
example, or IBM Database 2 (DB2) databases, whether residing locally or on remote systems. The major
concerns here are probably how to deal with concurrent access, network permissions, and security, and how
to deal with long transactions that might be interrupted, but these problems are currently being studied.
In  addition,  and  most  importantly,  there  are  institutional  constraints  that  would  effectively  prevent  the 
sharing  of  data  in  the  first  place. 

Spatial  Query  Languages 

The  implications  of  allowing  a  user  to  directly  query  the  database,  as  opposed  to  controlled  queries 
via  interfacing  programs,  should  be  studied  in  detail.  Having  said  this,  the  user  may  find  it  helpful  to  be 
able  to  make  a  custom  query  directly  on  the  database  instead  of  having  a  programmer  somehow  develop 
an  interface  to  the  database  software.  A  user  might  wish  to  be  able  to  directly  query  the  database  for 


23 Stonebraker, M., and J. Dozier, May 1991.

24 Stonebraker, M., “Inclusion of New Types in Relational Database Systems,” Proc. 2nd Int. IEEE Conf. Data Engineering
(February 1986).

25 Stonebraker, M., “The POSTGRES Storage System,” Proc. 12th Very Large Database Conference (IEEE, 1987).

26 Stonebraker, M., et al., “Extensibility in POSTGRES,” IEEE Database Engineering, Vol 6 (IEEE, 1987).

27 Rowe, L., and M. Stonebraker, “The POSTGRES Model,” Proc. 12th Very Large Database Conference (IEEE, 1987), pp 83-96.

28 Greene, D., “An Implementation and Performance Analysis of Spatial Data Access Methods,” Proc. 5th Int. IEEE Conf. Data
Engineering (February 1989), pp 606-615.

29 Stonebraker, M., L. Rowe, and M. Hirohama, “The Implementation of POSTGRES,” IEEE Trans. on Knowledge and Data
Engineering, 2(1) (March 1990).

30 Kameny, I., “Global Information System Issues,” Proc. 5th Int. IEEE Conf. Data Engineering (February 1987), pp 672-673.

31 Zimmerman, H., “OSI Reference Model - The ISO Model of Architecture for Open Systems Interconnection,” IEEE Trans. on
Communications, COM-28(4) (April 1980), pp 425-432.

32 Open Software Foundation, Application Environment Specification (Prentice-Hall, 1990).

normal analysis purposes. For example, a user should be able to give a generic command like “create new
raster = good_sites where soils = [1,3,7,9-12] and slope < 5 and land_use != [2,3,4,7-9] and distance from
roads < 200 meters.”
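
To make the semantics of such a command concrete, the sketch below evaluates the same conditions cell by cell over four small raster layers. It assumes each layer is a list of integer rows of equal size and that distance from roads has already been computed as its own layer in meters; the helper names expand and good_sites and all cell values are hypothetical.

```python
from typing import List, Set

Raster = List[List[int]]


def expand(spec: str) -> Set[int]:
    """Expand a category list such as '1,3,7,9-12' into a set of integers."""
    values: Set[int] = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            values.update(range(int(low), int(high) + 1))
        else:
            values.add(int(part))
    return values


def good_sites(soils: Raster, slope: Raster, land_use: Raster,
               road_distance: Raster) -> Raster:
    """Return a raster with 1 where every condition holds and 0 elsewhere."""
    soil_ok = expand("1,3,7,9-12")
    use_excluded = expand("2,3,4,7-9")
    rows, cols = len(soils), len(soils[0])
    return [
        [
            int(soils[r][c] in soil_ok
                and slope[r][c] < 5
                and land_use[r][c] not in use_excluded
                and road_distance[r][c] < 200)
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# Tiny 2 x 2 layers with made-up category values.
soils = [[1, 2], [10, 7]]
slope = [[3, 3], [4, 8]]
land_use = [[1, 1], [5, 1]]
road_distance = [[150, 150], [199, 50]]   # meters, assumed precomputed

result = good_sites(soils, slope, land_use, road_distance)
```

Only the two cells in the first column satisfy every condition: the upper right cell fails the soils test and the lower right cell fails the slope test. A query language would let the user state these conditions directly instead of writing the loop.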

The  usefulness  of  such  a  query  mechanism  seems  obvious.  Indeed,  extensive  work  has  been  done 
concerning  spatial  queries,  and  the  reader  is  referred  to  Greene33  or,  for  a  more  extensive  view,  to 
Samet.34  As  for  the  usefulness  of  SQL  as  a  basis  for  a  spatial  query  language,  this  matter  is  probably 
best viewed by observing that systems using SQL have become successful in the past. This is most likely
because SQL and languages like it are intuitive: they are English-like, and their meaning
can be interpreted in most cases by reading the query. Clearly, an SQL-like language would require
special  extensions  that  would  make  sense  only  in  the  context  of  a  GIS.  For  an  excellent  discussion  of  the 
pros  and  cons  of  SQL,  see  articles  by  Stonebraker35  and  Frank.36 

Raster-Vector  Integration 

Well researched and proven data structures (or classes of data structures) that are well suited to
each of the two basic types of geographic data, raster and vector, already exist. To try to generalize and adapt one generic data
structure  to  these  data  types  would  preclude  the  storage  and  manipulation  efficiencies  provided  by  the 
separate  data  types.  By  retaining  the  optimal  data  structure  for  each  data  type,  the  benefits  of  these 
structures  can  be  exploited.  The  problem  with  this  approach  is  that  specific  techniques  are  required  to 
access,  manipulate,  and  display  each  data  type.  The  real  trouble  comes  when  the  data  type  to  be  stored 
is  complex  or  interrelated  with  other  data  (although  the  POSTGRES  system  mentioned  earlier  provides 
some  of  this  capability).  Issues  and  suggestions  relating  to  data  structures  are  discussed  further  in 
Chapter  3. 


Image  Processing 

Image  processing  has  received  much  attention  in  the  literature  lately,  especially  in  the  application 
of  parallel  processing  hardware.  This  was  no  doubt  driven  for  the  most  part  by  the  field  of  computer 
vision because of the need to process images in real time.37,38 It has been shown many times that
images can be processed much faster using parallel hardware and even using parallel software techniques
on sequential processors.39,40,41

Another technology of interest for application to image processing tasks, especially pattern
recognition tasks, is the neural network. Limited research has been done on integrating an image
processing system with neural networks.42


33 Greene, D., pp 606-615.

34 Samet, H., Applications of Spatial Data Structures: Computer Graphics, Image Processing, and GIS (Addison-Wesley, Reading,
MA, 1990).

35 Stonebraker, M., “Future Trends in Data Base Systems,” IEEE Trans. on Knowledge and Data Engineering, 1(1) (March 1989),
pp 33-44.

36 Frank, A.U.

37 Rice, T.A., and L.H. Jamieson, “Parallel Processing for Computer Vision” in S. Levialdi, ed., Integrated Technology for Parallel
Image Processing (Academic Press, 1985).

38 Dew, P.M., R.A. Earnshaw, and T.R. Heywood, eds., Parallel Processing for Computer Vision and Display (Addison-Wesley,
NY, 1989).

39 Dew, P.M., et al.

40 Duller, A.W.G., R.H. Storer, A.R. Thomson, and E.L. Dagless, “An Associative Processor Array for Image Processing,” Image
and Vision Computing, 7(2) (May 1989), pp 151-158.

41 Page, I., ed., Parallel Architectures and Computer Vision (Clarendon Press, 1988).

42 Wu, X., and J.D. Westervelt, Engineering Applications of Neural Computing: A State of the Art Survey, Special Report N-
91/22/ADA237628 (U.S. Army Construction Engineering Research Laboratory [USACERL], May 1991).

Beyond the fact that image processing will take advantage of parallel architectures, there is the question
of the software that would drive the application. Integrated image processing software packages abound,
both  in  the  public  domain  and  especially  in  the  private  sector  as  vendors  scramble  to  meet  the  needs  of 
industry  and  government.43,44,45  Possible  new  image  processing  software  for  GIS  application  might 
include  modules  for  automatic  image  classification,  automatic  feature  extraction  (for  automatically 
extracting  certain  data  and  possibly  for  removal  of  cartographic  annotations  in  scanning  paper  maps), 
change  detection  and  analyses,  etc.  Also  possible  are  fast  implementations  of  the  standard  set  of  image 
processing  algorithms  for  geometric  correction,  image  enhancement  and  filtering,  and  classification. 

Image  processing  for  GIS  application  should  be  able  to  directly  use  advances  in  image  processing 
for  other  applications.  In  fact,  the  authors  believe  most  image  processing  software  could  be  applied 
directly  because  there  is  no  need  for  the  software  to  recognize  a  geographic  reference  in  order  to  perform 
its  tasks.  The  user  need  only  supply  the  data  as  a  matrix  and  let  the  database  software  handle  the 
locational  issues. 
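
The division of labor described above can be illustrated with a short sketch. It is not taken from GRASS or from any image processing package; the `GeoRaster` record, its `west`, `north`, and `cell_size` fields, and the `mean_filter` routine are hypothetical names chosen for the example. The filter sees only a matrix, while the database-side record keeps the geographic reference.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Matrix = List[List[float]]


def mean_filter(image: Matrix) -> Matrix:
    """3x3 mean filter on a plain matrix; it knows nothing about geography."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Average over the neighbours that fall inside the image.
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out[r][c] = sum(window) / len(window)
    return out


@dataclass
class GeoRaster:
    """Hypothetical database-side record: cell values plus their location."""
    cells: Matrix
    west: float       # x coordinate of the left edge
    north: float      # y coordinate of the top edge
    cell_size: float  # ground units per cell

    def apply(self, image_op: Callable[[Matrix], Matrix]) -> "GeoRaster":
        # Hand only the matrix to the image routine; keep the georeference.
        return GeoRaster(image_op(self.cells), self.west, self.north, self.cell_size)

    def cell_center(self, row: int, col: int) -> Tuple[float, float]:
        x = self.west + (col + 0.5) * self.cell_size
        y = self.north - (row + 0.5) * self.cell_size
        return (x, y)
```

Any routine with the same matrix-in, matrix-out shape could be passed to `apply` without modification, which is the point the paragraph makes.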


Modeling  and  Analysis 

This  area  of  software  is  key  to  the  successful  application  of  GIS  technology  by  engineers  modeling 
physical  systems.  If  the  appropriate  tools  are  available  to  the  engineer,  a  successful  application  of  GIS 
technology  can  be  realized.  However,  some  areas  are  lacking  in  current  GISs,  and  these  must  be 
addressed. 

As  stated  previously,  an  engineer  can  model  a  system  using  objects.  The  modeler  can  tailor  an 
object  to  have  all  of  the  characteristics  and  behavior  needed  to  model  the  system  in  question.  The 
database  software  can  help  in  this  by  facilitating  the  creation  of  objects  and  corresponding  methods  that 
act  on  the  object  and  communication  between  objects.  Once  the  modeler  has  implemented  the  objects 
needed  to  model  a  system,  there  must  be  a  mechanism  for  simulating  the  behavior  of  the  system  over 
time.  The  database  software  can  again  come  to  the  aid  of  the  modeler  by  supplying  rules  (or  “triggers”) 
and timers that can be associated with the data objects. Triggers are associated with objects and monitor
the database for specified conditions. Triggers make it possible to perform actions within the database
without actively using the database. Once a trigger condition is met, the trigger action (usually some form
of database transaction) is scheduled for completion. In this way, the database becomes active, and triggers
can start a flurry of activity without the presence of an operator. Timers can also be used in conjunction
with rules or triggers to provide a simulation mode. Timers can be tied to real wall-clock time or they
can be used to generate virtual time increments. Timers can vary in “granularity,” meaning that a timer
tick can be viewed as anything from a microsecond to a millennium.
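
A minimal sketch of the trigger-and-timer mechanism follows. It assumes a toy in-memory "database" (a dictionary of named values); the `Trigger` and `ActiveDatabase` classes and their fields are hypothetical and do not correspond to any particular DBMS. Each call to `tick` advances a virtual clock by one increment of whatever granularity the model uses, then runs the action of every trigger whose condition holds.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Trigger:
    """A rule: when condition(state) holds, schedule action(state)."""
    name: str
    condition: Callable[[Dict[str, float]], bool]
    action: Callable[[Dict[str, float]], None]


@dataclass
class ActiveDatabase:
    state: Dict[str, float] = field(default_factory=dict)
    triggers: List[Trigger] = field(default_factory=list)
    clock: float = 0.0  # virtual time, in whatever unit the model uses
    log: List[str] = field(default_factory=list)

    def tick(self, granularity: float) -> None:
        """Advance virtual time by one timer tick and fire matching triggers."""
        self.clock += granularity
        # Evaluate all conditions first, then run the scheduled actions,
        # so that one action cannot hide another trigger in the same tick.
        scheduled = [t for t in self.triggers if t.condition(self.state)]
        for trig in scheduled:
            trig.action(self.state)
            self.log.append(f"t={self.clock}: {trig.name}")
```

For example, a reservoir model might register one trigger that adds inflow on every tick and another that caps the water level once it exceeds capacity; the same rules run unchanged whether a tick stands for an hour or a year.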

Software  for  performing  many  tasks  in  modeling  and  analysis  of  engineering  data  is  available 
currently.  Software  developed  in  this  category  will  be  driven  in  large  part  by  the  applications  that  will 
be  funded  for  development.  Hydrologic  modeling  applications  will  probably  appear  first,  as  there  seems 
to  be  a  great  interest  in  solving  these  problems  within  a  GIS  framework.46 


43 Brown, V., and A.M. Gcmazian, “The March of GIS,” Workstation News, 2(2) (February 1991), p 24.

44 Harmon, S., “The Image Makers,” Workstation News, 2(2) (February 1991), pp 22-23.

45 Warnick, L., and D. Blaylock, “Theory of Revolution,” Workstation News, 2(2) (February 1991), pp 26-27.

46 Johnson, L.E., et al., “Geographic Information Systems for Hydrologic Modeling,” Proc. 3d ASCE Water Resources Operations
Management Workshop: Computerized Decision Support Systems for Water Managers (June 1988), pp 736-749.


Map Data Input, Output, and Editing

Input of geographic data using digital tape, optical disks, scanners, tablet digitizers, and even video
should be available, as should input by data conversion from other formats.
Sophisticated  output  of  the  data  should  also  be  facilitated  by  supporting  plotters,  printers,  multiple 
concurrent  displays,  and  video. 

The  GIS  should  also  be  able  to  convert  data  into  other  formats  to  allow  transfer  of  data  between 
systems.  The  user  should  be  able  to  create  new  data  or  modify  existing  data  through  some  form  of  editing 
(on-screen  digitizing)  capability.  This  editing  facility  might  even  include  a  CAD-like  capability  with  the 
ability  to  edit  all  geographic  data  types.  This  would  be  a  useful  capability,  considering  that  much  digital 
cartography is simply electronic drafting.47 Extension of this editing capability to GIS would present
problems as well as solve them. This capability would be facilitated by database software that could
handle complex data objects.48 Import and export facilities that could interface with popular CAD
formats would therefore be desirable.


User  Interface 

It  is  likely  that  all  of  the  software  predicted  in  this  report  will  employ  sophisticated  GUIs.  The 
proliferation  of  user  interface  development  products  and  research  indicates  that  the  industry  expects  GUIs 
to  be  a  very  important  component  of  future  software  systems.  In  fact,  GISs  have  been  criticized  heavily 
because  they  are  traditionally  difficult  to  learn,  use,  and  customize.  Furthermore,  user  proficiency  gained 
on  one  product  is  not  easily  transferred  to  another.49  The  user  interface  is  the  appropriate  place  to  “glue” 
all  of  a  system’s  disparate  modules  together  into  a  cohesive  whole.  There  is  little  doubt  that  GUIs  are 
the  “way  of  the  future,”  if  one  can  judge  by  the  number  of  articles  on  the  topic  in  recent  trade  journals. 

The X Window System is a network-based graphics system developed at MIT and adopted as an
industry standard. It is expected that most GUIs developed for Unix-based GISs in the future will use the
X Window System, probably with a user interface such as OSF/Motif.

Many GISs available today are difficult to learn, use, and customize. Furthermore, experience
gained from using one GIS is not readily transferable to the use of another. These problems have been
addressed by UGIX,50 an interface environment that is independent of the underlying GIS. UGIX could
be used as the basis for a more standardized user interface if widely accepted by the GIS user community.

GISs tend to be large collections of program modules. It is often difficult to combine these modules
into a coherent new application. Visual programming environments (VPEs) such as those found in the
XVision image processing system51 and the animation production Environment (apE) visualization
software52 are becoming very popular because they allow users to create applications by graphically
arranging program modules on the screen. VPEs use icons (or “glyphs”) to represent independent program
modules. These icons are then connected graphically in terms of data flow to build specific applications.
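
The data-flow idea behind a VPE can be sketched in a few lines. The following is an illustration only and does not reproduce the design of XVision or apE; the `Pipeline` class and its `add` and `run` methods are hypothetical names. Each glyph is a named module together with the names of the glyphs that feed it, and running a glyph first evaluates everything upstream of it.

```python
from typing import Callable, Dict, List, Optional, Tuple


class Pipeline:
    """Hypothetical data-flow graph: nodes are modules, edges carry data."""

    def __init__(self) -> None:
        self.modules: Dict[str, Callable] = {}
        self.inputs: Dict[str, List[str]] = {}

    def add(self, name: str, func: Callable, inputs: Tuple[str, ...] = ()) -> None:
        # A glyph is a named module plus the names of the glyphs feeding it.
        self.modules[name] = func
        self.inputs[name] = list(inputs)

    def run(self, name: str, cache: Optional[Dict[str, object]] = None):
        """Evaluate one glyph by first evaluating everything upstream of it."""
        cache = {} if cache is None else cache
        if name not in cache:
            args = [self.run(src, cache) for src in self.inputs[name]]
            cache[name] = self.modules[name](*args)
        return cache[name]
```

In a graphical environment the calls to `add` would be generated by placing and wiring icons on the screen rather than written by hand.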


47 Cowen, D.J.

48 Wilkes, W., et al.

49 Raper, J.F., and M.S. Bundock, “UGIX: A GIS Independent User Interface Environment,” Auto-Carto 10 (March 1991),
pp 275-295.

50 Raper, J.F., and M.S. Bundock.

51 Rasure, J., S. Hallett, and R. Jordan, “A Comprehensive Software System for Image Processing and Programming,” SPIE
Proceedings Vol. 1075, Digital Image Processing Applications (January 1989), pp 37-45.

52 Anderson, H.S., et al., The Animation Production Environment: A Basis for Visualization and Animation of Scientific Data, Ohio
Supercomputer Graphics Project Technical Report TR-04 (Ohio State University, March 1989).

It is likely that the GIS of the future will incorporate at least the following features within its GUI:

•  An  easy-to-use  API 

•  A  set  of  command-line  programs  built  with  the  API 

•  A  VPE  for  building  applications  based  on  the  command-line  programs 

•  A  GUI  including  an  overall  interface  to  all  GIS  functions  and  a  tool  for  performing  direct 
database  queries  (as  described  previously). 

The  GUI  described  in  the  last  point  above  should  integrate  functionality  provided  by  the  API  and 
the  functionality  contained  in  the  command  line  (and  as  extended  by  the  VPE). 

The  API  will  be  built  directly  on  the  DBS,  allowing  low-level  access  to  data  and  database 
functionality.  The  ability  to  include  in  application  programs  embedded  queries,  similar  to  the  query 
language,  would  also  be  a  positive  feature. 
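
An embedded query of the kind mentioned above might look like the following sketch. SQLite, accessed through Python's standard `sqlite3` module, stands in here for the DBS purely as an assumption for illustration; the `site` table, its columns, and the `sites_with_soil` function are hypothetical.

```python
import sqlite3


def open_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory table of map features (illustrative schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE site (id INTEGER PRIMARY KEY, name TEXT, soil TEXT, area_ha REAL)"
    )
    conn.executemany(
        "INSERT INTO site (name, soil, area_ha) VALUES (?, ?, ?)",
        [("north range", "loam", 120.0), ("motor pool", "clay", 4.5), ("south range", "loam", 75.0)],
    )
    return conn


def sites_with_soil(conn: sqlite3.Connection, soil: str, min_area_ha: float = 0.0):
    """API call whose body is an embedded, parameterized query."""
    rows = conn.execute(
        "SELECT name, area_ha FROM site WHERE soil = ? AND area_ha >= ? ORDER BY area_ha DESC",
        (soil, min_area_ha),
    )
    return rows.fetchall()
```

The application programmer calls `sites_with_soil` like any other function, while the query text inside it is the same language a user would type at an interactive prompt.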


3  IMPLEMENTATION  ISSUES  AND  STRATEGIES


As  stated  earlier,  the  GIS  functionality  described  in  this  report  could  be  built  into  a  custom  DBMS, 
or  the  GIS  database  could  be  designed  and  written  from  scratch  using  specialized  storage  and  retrieval 
techniques designed specifically for geographic data types. Before beginning a discussion of the pros and
cons of each strategy, the authors will present some background that should be kept in mind during that
discussion. Because any next-generation GIS must address anticipated advances in hardware technology,
it  is  important  to  mention  some  interesting  technological  issues,  and  their  relationship  to  GIS. 


Hardware  and  Software  Issues 

This  section  describes  several  important  issues  concerning  the  implementation  of  advanced  GIS 
software  on  advanced  hardware. 

The  Need  for  High  Performance 

The  application  of  GIS  technology  to  engineering  modeling  tasks  has  been  very  limited.  One  may 
wonder  why,  considering  that  the  GIS  model  fits  well  with  many  engineering  modeling  tasks.  The  answer 
may  simply  be  that  the  hardware  platforms  upon  which  GISs  are  implemented  do  not  supply  the 
“horsepower”  needed  for  many  engineering  tasks,  but  actually  it  is  more  likely  to  result  from  a 
combination  of  factors.  Hardware  speed  is  certainly  one  of  these  factors.  Another  is  that  GISs,  as 
currently  implemented,  simply  do  not  provide  the  functionality  required  for  many  engineering  models. 
One  key  function  missing  from  GISs  is  the  ability  to  effectively  add  a  temporal  dimension  to  a  model. 
This  shortcoming  could  be  addressed  by  implementing  rules  and  timers  in  the  GIS  database  (as  discussed 
in  the  previous  chapter  under  “Database  Software”  and  “Modeling  and  Analysis”). 

The  answer  to  this  question  also  depends  on  the  applications  faced  by  the  GIS  user.  For  real-time 
or  pseudo  real-time  applications,  or  for  computation-intensive  procedures,  high  performance  hardware 
platforms  are  a  must.  However,  if  rapid  gains  in  sequential  machine  performance  continue  at  the  present 
pace,  it  may  be  unnecessary  to  parallelize  all  but  the  most  compute-intensive  algorithms. 

Parallel  Architectures 

In  an  attempt  to  offer  some  insight  into  the  application  of  parallel  processing  technology  to  GISs, 
three  basic  categories  of  parallel  computer  architecture  are  described  below:  SIMD  (single  instruction 
multiple  data),  MIMD  (multiple  instruction  multiple  data),  and  hybrid  SIMD/MIMD  architectures. 

SIMD  machines  represent  a  form  of  synchronous,  highly  parallel  processing.53  Systems  with  as 
many as 2^16 (65,536) processors are currently available commercially. An SIMD machine consists of a control
unit,  a  set  of  processing  elements  (PEs),  each  with  its  own  memory,  and  an  interconnection  network.  The 
control  unit  broadcasts  instructions  to  all  PEs,  and  each  active  PE  executes  the  instruction  on  the  data  in 
its  own  memory.  The  interconnection  network  allows  data  to  be  transferred  among  the  PEs.  This 
arrangement  is  well  suited  for  exploiting  the  parallelism  inherent  in  certain  tasks  performed  on  vectors  and 
arrays.  As  SIMD  architectures  mature,  the  speed  of  nearest-neighbor  and  inter-PE  communication  will 
reach  very  acceptable  levels.  In  fact,  configurations  exist  where  intercommunication  rates  have  already 
exceeded 1 gigabyte per second. One problem with the SIMD configuration is the transfer of data from
disk to processors and vice versa (the input/output subsystem). However, input/output (I/O) rates have
reached 1 gigabyte per second in systems with specially designed hardware.

53 Duncan, R., “A Survey of Parallel Computer Architectures,” IEEE Computer, 23(2) (February 1990), pp 5-16.
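
The SIMD execution model described above can be mimicked in software. The following toy simulation is a sketch under stated assumptions, not a description of any real machine: the `SimdArray` class and its methods are hypothetical, each list element plays the role of one PE's local memory, and an ordinary Python loop stands in for the lock-step hardware.

```python
from typing import Callable, List


class SimdArray:
    """Toy SIMD machine: one instruction stream, one data value per PE."""

    def __init__(self, data: List[float]) -> None:
        self.memory = list(data)          # memory[i] is local to PE i
        self.active = [True] * len(data)  # activity mask set by the control unit

    def broadcast(self, instruction: Callable[[float], float]) -> None:
        # Every active PE applies the same instruction to its own datum.
        self.memory = [
            instruction(x) if on else x
            for x, on in zip(self.memory, self.active)
        ]

    def mask(self, predicate: Callable[[float], bool]) -> None:
        # Enable only the PEs whose local datum satisfies the predicate.
        self.active = [predicate(x) for x in self.memory]

    def shift_from_left(self, fill: float = 0.0) -> List[float]:
        # Interconnection network: each PE receives its left neighbour's datum.
        return [fill] + self.memory[:-1]
```

A raster operation such as "set every negative cell to zero" becomes one `mask` followed by one `broadcast`, regardless of the number of cells.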

MIMD machines represent asynchronous parallel processing.54 MIMD systems with many
thousands of processors are currently available, although the number of processors is usually under 100.
A MIMD machine usually consists of P processors and M memories, with M greater than or equal to P,
where each processor can follow an independent instruction stream. Two major categories of MIMD
architecture are shared memory and local memory.

In  shared-memory  MIMD  architectures  the  PEs  share  a  common  global  memory,  or  have  access  to 
both  global  and  local  memory.  In  local-memory  (or  “distributed  memory”)  MIMD  architectures,  each  PE 
has  its  own  local  memory,  and  data  are  transferred  via  an  interconnection  network  much  like  an  SIMD 
architecture. MIMD architectures are useful for general parallel processing tasks. In terms of GIS
implementation,  the  main  disadvantage  of  MIMD  architectures  is  that  computation  is  asynchronous.  Many 
GIS  computations  must  be  synchronous,  however,  because  they  must  share  data  and,  thus,  rely  on  the 
completion  of  certain  computations  by  other  processors.  This  synchronization  problem  leads  to 
programming  difficulties  and  does  not  take  full  advantage  of  the  capability  of  the  architecture. 
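
The synchronization requirement can be made concrete with a small sketch. It assumes Python threads as stand-ins for MIMD processors (they illustrate the coordination pattern, not a speedup), and the `relax` function is a hypothetical example of a neighborhood computation: each step replaces every interior cell by the mean of its two neighbors. A barrier prevents any worker from starting the next step until all workers have finished the current one.

```python
import threading
from typing import List


def relax(cells: List[float], steps: int, workers: int = 2) -> List[float]:
    """Repeatedly replace each interior cell by the mean of its two neighbours."""
    current = list(cells)
    nxt = list(cells)
    n = len(cells)

    def swap_buffers() -> None:
        nonlocal current, nxt
        current, nxt = nxt, current

    # The barrier action runs once per step, after every worker has arrived.
    barrier = threading.Barrier(workers, action=swap_buffers)

    def worker(rank: int) -> None:
        for _ in range(steps):
            # Each worker owns every `workers`-th interior cell.
            for i in range(1 + rank, n - 1, workers):
                nxt[i] = (current[i - 1] + current[i + 1]) / 2.0
            barrier.wait()  # no one starts step k+1 until step k is complete

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return current
```

Without the barrier, a fast worker could read neighbor values that a slower worker had not yet updated, which is the kind of programming difficulty the paragraph refers to.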

Most commercially available parallel machines are actually a hybrid of SIMD and MIMD
architectures. At the extreme end of the hybrid spectrum are truly configurable SIMD/MIMD machines
like the PASM55 (partitionable SIMD/MIMD) system. The model assumed here combines SIMD and
MIMD attributes. Each PE contains the same code but executes the code on different data. However,
within each PE, the code can run in MIMD mode. This modification of the basic model allows faster
execution of some code than in the pure SIMD model without the problems inherent to the full flexibility
of an MIMD machine. This architecture has proven successful in practice for computer vision, image
processing, and pattern recognition tasks.

There  is  no  way  to  know  with  certainty  which,  if  any,  of  the  architectures  outlined  above  will 
actually  be  available  in  a  low-cost  workstation  by  the  year  2000,  but  one  can  make  some  reasonable 
projections  based  on  industry  trends.  SIMD  architectures  are  currently  available  for  under  $200,000  (with 
up  to  1024  processors),  but  it  seems  unlikely  that  this  architecture  will  be  common  in  workstations  because 
it  is  well  suited  for  only  a  relatively  small  class  of  applications.  However,  it  will  be  necessary  to  monitor 
progress  in  the  area  of  SIMD  hardware.  Relatively  inexpensive  add-on  hardware  may  possibly  become 
available. It is more likely that MIMD architecture will end up in the mainstream. Several
vendors currently offer machines with eight or more processors for under $150,000, and it seems likely
that the more general-purpose MIMD architecture will be more readily available. In fact, add-on
hardware devices called transputers, each containing one or more PEs, are now available. When connected
to  a  workstation,  transputers  can  make  a  MIMD  machine  out  of  an  existing  workstation  at  a  reasonable 
cost.  In  addition,  software  that  is  now  available  allows  the  user  to  configure  a  network  of  workstations 
to  act  as  a  single  MIMD  machine.56  The  SIMD/MIMD  hybrids,  although  well  suited  to  the  GIS 
framework,  are  primarily  (and  will  probably  remain)  research  tools,  and  will  probably  not  be  available  on 
a  production  basis  for  many  years. 


54 Duncan, R.

55 Siegel, H.J., et al., “PASM: A Partitionable SIMD/MIMD System for Image Processing and Pattern Recognition,” IEEE Trans.
on Computers, C-30(12) (December 1981), pp 934-947.

56 Duncan, R.

Technological  Bottlenecks

Several  bottlenecks  that  currently  inhibit  performance  are  now  being  overcome,  which  will  allow 
the  successful  application  of  parallel  processing  technology  to  GISs.57  Processor  speeds  are  improving 
at  a  dramatic  rate.  However,  there  is  more  to  a  computer  system  than  the  processor.  Gains  in  memory 
and  I/O  bandwidth  must  match  the  gains  due  to  higher  processor  speeds  and  use  of  more  processors  in 
order to have a system in which the performance potential of the processors is realized. Although
significant problems remain, memory and I/O rates are reported to be reaching very acceptable levels.58

If a set of processors needs data to perform some operation, it must usually access some kind of
secondary memory (usually magnetic disk) to bring the desired data into PE memory, and this access is
slow relative to processor speed. One possible solution to this problem is to provide some type of data
or file cache, where frequently used data can be stored off-PE until needed. This could work because
memory access speeds are much higher than disk transfer speeds, so data cached in memory reduce the
need for disk access. Toward this end, dynamic random access memory (DRAM) technology is advancing.
This technology is currently expensive, but the cost is constantly decreasing. DRAM storage of data
(sometimes referred to as solid state disk, or SSD) will become more and more feasible as the cost per
megabyte of storage continues to decrease.59
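
The caching idea can be sketched as follows. The report does not specify a replacement policy; least-recently-used eviction is assumed here for illustration, and the `BlockCache` class and the `read_block` callable (standing in for a disk read) are hypothetical names.

```python
from collections import OrderedDict
from typing import Callable


class BlockCache:
    """Least-recently-used cache in front of a slow block reader."""

    def __init__(self, read_block: Callable[[int], bytes], capacity: int) -> None:
        self.read_block = read_block  # stands in for a disk read
        self.capacity = capacity      # number of blocks held in memory
        self.blocks: "OrderedDict[int, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, block_no: int) -> bytes:
        if block_no in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_no)  # mark as most recently used
            return self.blocks[block_no]
        self.misses += 1
        data = self.read_block(block_no)       # the expensive path
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict least recently used
        return data
```

The benefit depends on the access pattern: only repeated requests for blocks still held in memory avoid the disk.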

Meanwhile,  other  technologies  that  address  the  I/O  bottleneck  are  being  explored.  Read/write  optical 
disk  technology  is  maturing  rapidly,  and  data  transfer  rates  now  exceed  1  megabyte  per  second.  As  the 
cost  for  this  technology  decreases,  it  may  prove  useful  when  combined  with  disk  arrays  or  log-structured 
file  systems. 

Given  the  similar  performance  and  cost  per  megabyte  of  large  and  small  disks,  one  way  to  improve 
performance  is  to  replace  a  single  large  drive  by  an  array  of  smaller  drives.60,61  Such  an  array  provides 
higher  performance  because  many  small  requests  can  be  serviced  independently  and  large  requests  can  be 
spread over several disks to transfer in parallel. This technique, known as data striping, is especially
useful in systems where large data transfers are a frequent requirement, such as a GIS. Log-structured
file systems append the most recent additions to the end of a log. This can reduce disk seek times.62
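
Data striping can be illustrated with a short sketch. It assumes simple round-robin placement with no redundancy, which is only one of several possible layouts; the function names `locate`, `stripe`, and `unstripe` are hypothetical, and Python lists stand in for the disks.

```python
from typing import List, Tuple


def locate(block_no: int, n_disks: int) -> Tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk)."""
    return block_no % n_disks, block_no // n_disks


def stripe(data: bytes, n_disks: int, block_size: int) -> List[List[bytes]]:
    """Split data into blocks and deal them round-robin across the disks."""
    disks: List[List[bytes]] = [[] for _ in range(n_disks)]
    for start in range(0, len(data), block_size):
        disk, _ = locate(start // block_size, n_disks)
        disks[disk].append(data[start:start + block_size])
    return disks


def unstripe(disks: List[List[bytes]]) -> bytes:
    """Reassemble the original byte string from the striped blocks."""
    n_blocks = sum(len(d) for d in disks)
    parts = []
    for block_no in range(n_blocks):
        disk, offset = locate(block_no, len(disks))
        parts.append(disks[disk][offset])
    return b"".join(parts)
```

Because consecutive blocks land on different disks, a large sequential read such as a raster map can be serviced by all drives at once.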

Some set of I/O bottleneck solutions, in tandem with MIMD architecture, is the most likely approach
to be employed in the low-cost, highly parallel workstation of the year 2000. It also seems very likely
that GISs will be implemented on such a platform to take advantage of the increased performance.


57 Gaudiot, J.L., “Parallel Computing: One Opportunity, Four Challenges,” Proc. 5th Int. IEEE Conf. Data Engineering (February
1989), pp 482-484.

58 Frieder, O., “Communications Issues in Data Engineering: Have Bandwidth - Will Move Data,” Proc. 5th Int. IEEE Conf. Data
Engineering (February 1989), p 674.

59 Bate, G., “Alternative Storage Technologies,” Proc. IEEE COMPCON 89 (February 1989), pp 151-157.

60 Meador, W.E., “Disk Array Systems,” Proc. IEEE COMPCON 89 (February 1989), pp 143-146.

61 Patterson, D.A., et al., “Introduction to Redundant Arrays of Inexpensive Disks (RAID),” Proc. IEEE COMPCON 89 (February
1989), pp 112-117.

62 Douglis, F., and J. Ousterhout, “Log-Structured File Systems,” Proc. IEEE COMPCON 89 (February 1989), pp 124-129.

Notwithstanding  the  fact  that  the  architectures  themselves  are  “moving  targets,”  many  researchers
(and  programmers)  have  pointed  out  the  difficulties  associated  with  software  development  on  parallel 
architectures  in  general.63  Potential  solutions  to  the  problem  of  programming  complexity  include: 

•  Development  of  architecturally  independent  high-level  languages  (e.g.,  Linda,64  Seymour65) 

•  Extensions  of  existing  languages  (e.g.,  Fortran  8x,  C*,  *Lisp)

•  Vectorizing  compilers 

•  Developing  automated  tools  for  the  detection  of  parallelism  in  existing  codes. 

The  authors  believe  that,  as  research  into  these  areas  continues,  the  programmability  problem  will 
be  solved  at  least  enough  to  make  the  task  of  developing  software  manageable.  It  is  also  predicted  that 
the  solution  will  probably  employ  architecturally  independent  high-level  languages. 


Two  Implementation  Strategies 

The  following  sections  discuss  two  implementation  strategies  that  could  be  taken  to  realize  the  GIS 
and  software  described  in  Chapter  2.  Discussion  will  also  address  how  GRASS  fits  into  these  scenarios. 

Development  of  a  Specialized  Storage  and  Retrieval  System 

The  methodology  in  this  strategy  would  be  to  develop  a  new  storage  and  retrieval  system  based 
solely  on  the  geographic  data  types  deemed  necessary  to  implement  and  the  specialized  data  structures 
found  to  be  best  for  this  task.  The  data  types  of  interest  would  be  those  usually  associated  with  GISs: 
raster  and  vector  (the  latter  including  areal,  linear,  and  pointal  features).  Many  of  the  limitations  discussed 
in  Chapter  2  under  “Raster-Vector  Integration”  would  still  need  to  be  addressed,  as  well  as  a  few  more 
discussed  below. 

Systems  employing  this  strategy  are  now  starting  to  appear.  The  Environmental  Systems  Research 
Institute  (ESRI)  has  just  released  the  latest  version  of  its  GIS.  In  addition  to  its  traditional  vector-based 
functionality,  this  GIS  now  also  includes  raster-based  functionality  and  methods  for  transferring  data 
between the types. Earth Resources Data Analysis System (ERDAS) has recently released
ERDAS-ARC/INFO Live Link. Both of these systems, as well as others not mentioned here, are available
now. Both seem to be quality systems and, through the INFO portion of ARC/INFO, have the ability to
handle aspatial data. The major drawbacks to these systems are (1) cost to the user, (2) the fact that they
are hard to learn and use, and (3) inflexibility due to a lack of customization and access to internal data
structures. These systems are currently limited to visual integration; that is, they are really not yet
integrated at the data structure level.

Indeed  the  great  advantage  of  GRASS,  besides  its  superiority  for  raster  analysis,  has  been  its 
relatively  small  cost  and  the  direct  access  it  offers  to  internal  data  structures  for  customization  and  special 
purpose  applications.  With  these  points  in  mind,  the  development  needs  of  GRASS  may  be  explored  to 
discuss  how  the  system’s  weaknesses  might  be  remedied  by  a  specialized  storage  and  retrieval  system. 
GRASS’s  principal  development  needs  are  the  following:

•  The  ability  to  handle  floating  point  data 

•  An  improved  and  more  complete  vector  capability 


63 Lewis, T.G., “Issues in Parallel Programming: Why Aren't We Having Fun Yet?” Supercomputing Review (July 1990).

64 Carriero, N., and D. Gelernter, “Linda: Some Current Work,” Proc. IEEE COMPCON 89 (February 1989), pp 98-101.

65 Miller, R., and Q.E. Stout, “An Introduction to the Portable Parallel Programming Language: Seymour,” Proc. IEEE COMPSAC
89 (September 1989), pp 94-101.

•  An  improved  handling  of  aspatial  data. 

To address the first point above, the handling of floating point data could be accomplished by
modifying the current raster data structure to handle both integer and floating point data. The specific
methodology involved in making this extension would probably be to place a flag in the raster header file
designating the raster map as type integer or type floating point, and to design new storage and retrieval
functions for floating point maps. There still exists the problem of how to actually store floating point
numbers in a machine-independent way. This problem could be solved by selecting the most common
floating point format and providing special routines for converting to and from other formats.
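
The header-flag approach can be sketched as follows. This is not the GRASS raster format; the layout (a one-byte type flag followed by row and column counts), the flag values, and the function names are all hypothetical. The sketch assumes that IEEE 754 double precision in big-endian byte order is the "most common" format selected, and uses Python's standard `struct` module to fix the byte order independently of the machine.

```python
import struct
from typing import List, Tuple

TYPE_INT, TYPE_FLOAT = 0, 1  # hypothetical header flag values
HEADER = ">BII"              # type flag, rows, cols; big-endian, no padding


def _cell_format(cell_type: int, count: int) -> str:
    # ">i" is a 4-byte integer and ">d" an 8-byte IEEE 754 double; the ">"
    # fixes the byte order regardless of the machine writing the file.
    return ">%d%s" % (count, "d" if cell_type == TYPE_FLOAT else "i")


def write_raster(rows: List[List[float]], cell_type: int) -> bytes:
    """Header followed by the cell values in row-major order."""
    n_rows, n_cols = len(rows), len(rows[0])
    flat = [v for row in rows for v in row]
    header = struct.pack(HEADER, cell_type, n_rows, n_cols)
    return header + struct.pack(_cell_format(cell_type, len(flat)), *flat)


def read_raster(blob: bytes) -> Tuple[int, List[List[float]]]:
    """Read the flag first, then decode the cells with the matching format."""
    cell_type, n_rows, n_cols = struct.unpack_from(HEADER, blob, 0)
    flat = struct.unpack_from(
        _cell_format(cell_type, n_rows * n_cols), blob, struct.calcsize(HEADER)
    )
    rows = [list(flat[r * n_cols:(r + 1) * n_cols]) for r in range(n_rows)]
    return cell_type, rows
```

Existing integer maps would keep their current cell encoding under this scheme; only the reader's dispatch on the flag is new.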

To  address  the  second  point,  vector  capability  could  be  improved  to  include  the  kind  of  analysis 
capabilities  found  in  systems  such  as  ARC/INFO. 

The  third  point,  handling  aspatial  data,  is  a  twofold  problem.  First,  it  requires  the  improvement  of 
vector  capability  such  that  it  will  be  more  straightforward  to  handle  attributes  that  are  spatially  related  to 
a map’s features. Second, it requires some mechanism for storing and relating aspatial data. This
second task has traditionally (and correctly) been approached by integrating commercial DBMS software
such as Oracle or INGRES. It might be advantageous to explore integration with OODBMS or extended
RDBMS as discussed in Chapter 2.

This  implementation  strategy  still  has  several  drawbacks.  Existing  raster-based  GRASS  applications 
would  have  to  be  rewritten  to  accommodate  the  floating  point  data  type.  This  strategy  would  also  require 
modification  of  all  existing  GRASS  vector  software  and  creation  of  completely  new  vector  functionality. 
In addition to these drawbacks, there is the psychological barrier of going through the trouble to
implement something that is basically already available (or available with much less effort) in existing
database management systems.

Also,  this  approach  would  leave  the  GIS  developers  at  “Square  1”  in  making  engineering  modeling 
tasks  easier  to  implement.  Perhaps  the  most  bothersome  aspect  of  this  strategy  is  that  it  gives  no  attention 
to  the  projected  advances  in  hardware  technology.  To  accommodate  these  advances,  completely  new 
software  would  have  to  be  developed. 

Using  an  Extended  RDBMS  or  OODBMS  Core 

This  methodology  would  be  to  use  an  extensible  DBMS,  such  as  POSTGRES  or  a  public  domain 
OODBMS,  as  the  core  of  a  GIS  as  described  in  Chapter  2.  Using  a  DBMS  as  the  basis  for  the  GIS  would 
offer  all  of  the  advantages  inherent  in  these  systems: 

•  Better  support  for  complex  objects.  This  would  enhance  the  ability  to  model  physical 
systems,  as  discussed  in  Chapter  2  under  “Modeling  and  Analysis.” 

•  The  provision  of  extensible  data  types  (including  multiple  inheritance).  This  would  include 
the  ability  to  define  operators  and  access  methods  for  these  new  data  types.  It  would 
provide  the  mechanism  for  implementing  raster  and  vector  data  types  by  simply  extending 
existing  types.  It  would  also  provide  the  ability  to  create  specialized  data  types  based  on 
the basic geographic types. The ability to define spatial operators would make data
manipulation easier to deal with and also could provide for the inclusion of high level
topological relationships such as “within,” “beside,” “above,” or “near.” The ability to
define access methods would allow for optimization and might make certain tasks simpler
to implement.

•  The  ability  to  have  active  databases  and  inferencing.  Both  of  these  features  would  add 
many  new  potential  capabilities  to  applications  that  would  be  developed  using  such  a 
system.  These  include  a  database  history  through  multiple  versions  of  data,  triggers,  and 
timers. 

•  The  ability  to  take  advantage  of  advances  in  hardware.  If  the  GIS  could  exploit 
technologies  such  as  optical  disks,  multiple  tightly-coupled  processors,  and  custom  chips, 
less  effort  in  this  direction  would  be  necessary  when  developing  new  applications  for  this 
system. A major criterion in selecting an extended DBMS or OODBMS would be its inherent support
for advanced hardware technologies.

•  Strong  support  for  aspatial  data.  This  is  inherent  in  the  relational  database  model. 
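
The  "active  database"  features  listed  above  (version  history  and  triggers)  can  be  illustrated  with  a 
minimal  in-memory  sketch.  It  is  not  drawn  from  GRASS  or  from  any  particular  DBMS;  the  class 
VersionedStore,  its  methods,  and  the  example  key  are  hypothetical  names  chosen  only  for  illustration. 

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical trigger signature: (key, old_value, new_value) -> None
Trigger = Callable[[str, Optional[Any], Any], None]


class VersionedStore:
    """Toy 'active' store: keeps every version of a value and fires triggers on each write."""

    def __init__(self) -> None:
        self._history: Dict[str, List[Any]] = {}
        self._triggers: List[Trigger] = []

    def on_update(self, trigger: Trigger) -> None:
        """Register a callback that runs after every write."""
        self._triggers.append(trigger)

    def put(self, key: str, value: Any) -> None:
        """Append a new version instead of overwriting the old one, then fire triggers."""
        versions = self._history.setdefault(key, [])
        old = versions[-1] if versions else None
        versions.append(value)
        for trigger in self._triggers:
            trigger(key, old, value)

    def get(self, key: str, version: int = -1) -> Any:
        """Return one version of a value; the default (-1) is the most recent."""
        return self._history[key][version]

    def history(self, key: str) -> List[Any]:
        """Return a copy of all versions, oldest first."""
        return list(self._history[key])


# Usage: record land-use changes for a (hypothetical) parcel and log each change.
store = VersionedStore()
change_log = []
store.on_update(lambda key, old, new: change_log.append((key, old, new)))
store.put("parcel_17/land_use", "forest")
store.put("parcel_17/land_use", "cleared")
```

A  real  system  would  persist  the  versions  and  evaluate  triggers  inside  the  DBMS;  the  sketch  only  shows 
how  an  application  could  see  earlier  states  of  the  data  and  react  to  changes. 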

The  tasks  left  to  accomplish  would  be  to  define  the  new  spatial  types,  operators,  and  access  methods,  and 
to  implement  existing  tools  using  the  new  database.  The  authors  do  not  intend  to  minimize  the  difficulty 
in  completing  these  tasks,  but  the  points  outlined  above  represent  some  major  advantages  to  using  a  core 
DBMS  in  the  next  generation  of  GRASS. 
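
As  a  rough  illustration  of  the  three  tasks  named  above  (spatial  types,  operators,  and  access  methods), 
the  following  sketch  defines  two  basic  types,  a  specialized  type  derived  from  one  of  them,  the  operators 
"within"  and  "near",  and  a  simple  grid  index  as  an  access  method.  All  names  (Point,  Box,  Well, 
GridIndex)  are  assumptions  made  for  this  example,  and  a  uniform  grid  is  only  one  of  many  possible 
access  methods. 

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Point:
    """Basic geographic type (hypothetical)."""
    x: float
    y: float

    def near(self, other: "Point", tolerance: float) -> bool:
        """Spatial operator: true if the two points lie within `tolerance` of each other."""
        return math.hypot(self.x - other.x, self.y - other.y) <= tolerance


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangular region."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def within(p: Point, region: Box) -> bool:
    """Spatial operator: true if the point lies inside the region (boundary included)."""
    return region.xmin <= p.x <= region.xmax and region.ymin <= p.y <= region.ymax


@dataclass(frozen=True)
class Well(Point):
    """Specialized type built by extending a basic geographic type."""
    depth_m: float = 0.0


class GridIndex:
    """Access method: buckets points into square cells so a region query
    inspects only the cells that overlap the region."""

    def __init__(self, cell_size: float) -> None:
        self.cell_size = cell_size
        self._cells: Dict[Tuple[int, int], List[Point]] = {}

    def _cell(self, x: float, y: float) -> Tuple[int, int]:
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def insert(self, p: Point) -> None:
        self._cells.setdefault(self._cell(p.x, p.y), []).append(p)

    def query(self, region: Box) -> List[Point]:
        cx0, cy0 = self._cell(region.xmin, region.ymin)
        cx1, cy1 = self._cell(region.xmax, region.ymax)
        hits: List[Point] = []
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                # Candidate cells may extend past the region, so re-check each point.
                hits.extend(p for p in self._cells.get((cx, cy), []) if within(p, region))
        return hits
```

Because  Well  inherits  from  Point,  the  operators  and  the  index  accept  it  without  modification,  which  is 
the  kind  of  reuse  the  extensible-type  approach  is  meant  to  provide. 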


Research  Issues  for  a  DBMS-Based  GIS 

The  use  of  a  DBMS  as  the  basis  of  the  GIS  seems  to  be  the  best  way  to  address  many  of  the 
shortcomings  of  today’s  systems,  while  reducing  the  effort  required  to  overcome  system  weaknesses  and 
take  advantage  of  hardware  that  is  on  its  way  from  the  laboratory  to  the  market. 

The  observations  and  judgments  presented  here  are  based  on  research  into  the  current  state  of  GIS 
technology  and  projected  developments  in  low-cost  workstation  hardware.  The  next  logical  step  would 
be  to  test  the  hypotheses  developed  here  by  implementing  a  prototype  system.  In  this  connection  there 
is  another  important  advantage  to  using  a  DBMS  as  the  core  of  the  GIS:  prototyping  is  facilitated  by  the 
DBMS  itself.  The  cost  of  aborting  the  idea  due  to  unforeseen  circumstances  is  much  reduced.  The  authors 
believe  the  most  promising  approach  would  be  to  implement  a  prototype  using  an  object-oriented 
programming  language  and  an  OODBMS  in  order  to  take  advantage  of  the  efficiencies  associated  with 
object-oriented  programming.  Ultimately  the  research  must  determine  whether  an  OODBMS  would  or 
would  not  be  more  effective  as  the  core  of  the  GIS. 

Specific  research  questions  that  must  be  answered  early  in  the  research  and  development  process 
include  the  following: 

•  Does  an  OODBMS  provide  adequate  support  for  the  large  amounts  of  data  that  must  be  stored 
in  a  GIS? 

•  Does  an  extended  RDBMS  provide  adequate  support  for  large  amounts  of  GIS  data? 

•  What  is  the  best  way  to  deal  with  raster  (or  image)  data  within  an  object-oriented  framework? 

•  Assuming  that  both  an  extended  RDBMS  and  an  OODBMS  can  support  the  data  requirements 
for  a  GIS,  what  are  the  advantages  and  disadvantages  of  each  with  regard  to  spatial  analysis  and 
modeling  tasks? 

•  How  do  OODBMS  and  extended  RDBMS  support  (or  fail  to  support)  projected  advances  in 
hardware  technology,  and  how  will  they  promote  or  hinder  the  use  of  these  technologies? 

•  How  do  extended  RDBMS  and  OODBMS  differ  in  their  ability  or  inability  to  support  raster 
databases? 

•  How  can  OODBMS  facilitate  the  representation  of  high-level  topological  relationships? 

These  questions  can  most  effectively  be  answered  through  development  of  a  DBMS-based  prototype 
for  USACERL’s  next-generation  GIS. 


4  SUMMARY 


When  USACERL  created  GRASS  in  the  early  1980s,  the  GIS  was  designed  for  implementation  on 
hardware  that  would  be  available  up  to  10  years  into  the  future.  Subsequently,  the  system  was  widely 
accepted  among  workstation  users  both  in  the  public  and  private  sectors.  As  the  number  of  GRASS  users 
(and  uses)  grew,  new  demands  were  put  on  the  software,  some  of  which  highlighted  limitations  in  the 
current  system.  These  include: 

•  Lack  of  support  for  aspatial  data  associated  with  spatial  entities 

•  Lack  of  a  user-friendly  GUI 

•  Lack  of  vector  analysis  capability 

•  The  inability  to  store  anything  other  than  integer  data. 

Also,  hardware  development  has  proceeded  at  a  rapid  rate  since  GRASS  was  first  introduced.  GRASS 
is  not  able  to  take  advantage  of  extra  processing  power  available  in  new  hardware  now  coming  to  market. 

While  some  of  these  challenges  could  be  met  by  upgrading  the  current  system,  a  more  effective 
approach  would  be  to  design  a  new  generation  of  GRASS  that  addresses  these  issues  while  preserving  the 
successful  features  of  the  current  system. 

The  degree  to  which  these  issues  may  be  addressed  in  the  next  generation  of  GRASS  will  be  defined 
and  limited  by  underlying  data  structures  and  data  storage  techniques.  One  possible  design  for  the  next 
generation  of  GRASS  would  consist  of  a  “toolbox”  of  functional  modules  that  can  operate  on  specific  data 
types.  The  system  would  be  organized  into  layers  of  functionality  corresponding  to  the  domains  of  the 
system  programmer,  the  module  programmer,  the  applications  programmer,  and  the  GRASS  end  user.  Two 
possible  approaches  to  implementing  this  new  design  were  discussed: 

•  Development  “from  scratch”  of  a  specialized  storage  and  retrieval  system  exclusively  for 
geographic  data  types 

•  Building  the  required  GIS  functionalities  into  an  advanced,  customized  DBMS. 

While  the  first  strategy  has  been  employed  with  some  success  in  upgrades  of  other  GISs,  its  chief 
drawback  may  be  that  it  does  not  specifically  address  projected  advances  in  hardware  technology. 

The  second  strategy  offers  inherent  benefits  that  meet  important  GRASS  development  needs  (e.g., 
strong  support  for  aspatial  data,  ability  to  exploit  new  hardware  technology).  The  authors  consider  the 
potential  problems  with  this  strategy  to  be  surmountable  through  research  and  development. 


apE  animation  production  Environment 

API  application  programming  interface 

APP  application 

DB2  Database  2 

EN  Environmental  Division 

ERDAS  Earth  Resources  Data  Analysis  System 

GIS  geographic  information  system 

GRASS  Geographic  Resources  Analysis  Support  System 

ISO-OSI  International  Standards  Organization  Open  Systems  Interconnection 

MIMD  multiple  instruction  multiple  data 

NF2  non-first  normal  form  (database) 

ODE  Object  Database  and  Environment 

OODBMS  object-oriented  database  management  system 

OSF  Open  Software  Foundation 

PE  processing  element 

QI  query  interface 

SIMD  single  instruction  multiple  data 